NSF PAR Search | NSF Public Access Repository

Generating Distractors for Code Completion Problems: Can LLM Assist Instructors?

https://doi.org/10.32473/flairs.38.1.138995

Hassany, Mohammad; Akhuseyinoglu, Kamil; Lekshmi_Narayanan, Arun Balajiee; Agarwal, Arav; Savelka, Jaromir; Brusilovsky, Peter (May 2025, The International FLAIRS Conference Proceedings)

Code completion problems are an effective type of formative assessment, especially when used to practice newly learned concepts or topics. While there is a growing body of research in computing education on the use of large language models (LLMs) to support learning content development, the use of LLMs for producing high-quality code completion problems has not yet been explored. In this paper, we analyze the capability of LLMs to generate effective distractors (i.e., plausible but incorrect options) and explanations for completion problems. We utilize common student misconceptions to improve the quality of the generated distractors. Our study suggests that LLMs are capable of generating reasonable distractors and explanations. At the same time, we identify the lack of a sufficiently granular taxonomy of common student misconceptions, which would be needed to align the generated distractors with common misconceptions and errors, a gap that should be addressed in future work.
Free, publicly-accessible full text available May 14, 2026
Generating Effective Distractors for Introductory Programming Challenges: LLMs vs Humans

https://doi.org/10.1145/3706468.3706529

Hassany, Mohammad; Brusilovsky, Peter; Savelka, Jaromir; Lekshmi_Narayanan, Arun Balajiee; Akhuseyinoglu, Kamil; Agarwal, Arav; Hendrawan, Rully Agus (March 2025, ACM)

As large language models (LLMs) show great promise in generating a wide spectrum of educational materials, robust yet cost-effective assessment of the quality and effectiveness of such materials becomes an important challenge. Traditional approaches, including expert-based quality assessment and student-centered evaluation, are resource-consuming and do not scale efficiently. In this work, we explored the use of pre-existing student learning data as a promising approach to evaluate LLM-generated learning materials. Specifically, we used a dataset in which students completed program construction challenges by picking the correct answers among human-authored distractors, and we used it to evaluate the quality of LLM-generated distractors for the same challenges. The dataset included responses from 1,071 students across 22 classes taught from Fall 2017 to Spring 2023. We evaluated five prominent LLMs (OpenAI-o1, GPT-4, GPT-4o, GPT-4o-mini, and Llama-3.1-8b) across three different prompts to see which combinations result in more effective distractors, i.e., those that are plausible (often picked by students) and potentially based on common misconceptions. Our results suggest that GPT-4o was the most effective model, matching close to 50% of the functional distractors originally authored by humans. At the same time, all of the evaluated LLMs generated many novel distractors, i.e., those that did not match the pre-existing human-authored ones. Our preliminary analysis shows that these novel distractors appear to be promising. Establishing their effectiveness in real-world classroom settings is left for future work.
Free, publicly-accessible full text available March 3, 2026
Engaging an LLM to Explain Worked Examples for Java Programming: Prompt Engineering and a Feasibility Study

Hassany, Mohammad; Brusilovsky, Peter; Ke, Jiaze; Akhuseyinoglu, Kamil; Lekshmi-Narayanan, Arun-Balajiee (July 2024, CEUR Workshop Proceedings)

Worked code examples are among the most popular types of learning content in programming classes. Most approaches and tools for presenting these examples to students are based on line-by-line explanations of the example code. However, instructors rarely have time to provide line-by-line explanations of the large number of examples typically used in a programming class. This paper explores the opportunity to facilitate the development of worked examples for Java programming through a human-AI collaborative authoring approach. The idea of collaborative authoring is to generate a starting version of code explanations using an LLM and present it to the instructor to edit if necessary. The critical step towards implementing this idea is to ensure that the LLM can produce code explanations that look meaningful and acceptable to instructors and students. To achieve this goal, we performed an extensive prompt engineering study and evaluated the explanations produced by the selected prompt in a user study with students and authors.
Full Text Available
Authoring Worked Examples for JAVA Programming with Human AI Collaboration

https://doi.org/10.1145/3605098.3636160

Hassany, Mohammad; Ke, Jiaze; Brusilovsky, Peter; Lekshmi_Narayanan, Arun Balajiee; Akhuseyinoglu, Kamil (April 2024, Proceedings of ACM/SIGAPP Symposium on Applied Computing, SAC 2024)

Worked examples are among the most popular types of learning content in programming classes. However, instructors rarely have time to provide line-by-line explanations for the large number of examples typically used in a programming class. In this paper, we explore and assess a human-AI collaboration approach to authoring worked examples for Java programming. We introduce an authoring system for creating Java worked examples that generates a starting version of code explanations and presents it to the instructor to edit if necessary. We also present a study that assesses the quality of explanations created with this approach.
Full Text Available
Human-AI Co-Creation of Worked Examples for Programming Classes

Hassany, Mohammad; Brusilovsky, Peter; Ke, Jiaze; Akhuseyinoglu, Kamil; Lekshmi_Narayanan, Arun_Balajiee (March 2024, Proceedings of 5th Workshop on Human-AI Co-Creation with Generative Models (HA-GEN 2024) at IUI 2024)

Worked examples (solutions to typical programming problems, presented as source code in a particular language and used to explain topics in a programming class) are among the most popular types of learning content in programming classes. Most approaches and tools for presenting these examples to students are based on line-by-line explanations of the example code. However, instructors rarely have time to provide line-by-line explanations for the large number of examples typically used in a programming class. In this paper, we explore and assess a human-AI collaboration approach to authoring worked examples for Java programming. We introduce an authoring system for creating Java worked examples that generates a starting version of code explanations and presents it to the instructor to edit if necessary. We also present a study that assesses the quality of explanations created with this approach.
Full Text Available
Explaining Code Examples in Introductory Programming Courses: LLM vs Humans

Lekshmi-Narayanan, Arun-Balajiee; Oli, Priti; Chapagain, Jeevan; Hassany, Mohammad; Banjade, Rabin; Brusilovsky, Peter; Rus, Vasile (February 2024, Workshop on AI for Education - Bridging Innovation and Responsibility at AAAI 2024)

Worked examples, which present explained code for solving typical programming problems, are among the most popular types of learning content in programming classes. Most approaches and tools for presenting these examples to students are based on line-by-line explanations of the example code. However, instructors rarely have time to provide explanations for the many examples typically used in a programming class. In this paper, we assess the feasibility of using LLMs to generate code explanations for passive and active example exploration systems. To achieve this goal, we compare the code explanations generated by ChatGPT with the explanations generated by both experts and students.
Full Text Available
